
[BugFix]Fix eagle draft_model_config and add tests#31753

Merged
heheda12345 merged 7 commits into vllm-project:main from charlotte12l:fix_eagle
Jan 13, 2026
Conversation

@charlotte12l
Contributor

@charlotte12l charlotte12l commented Jan 5, 2026

Purpose

EAGLE models use EAGLEConfig to update the hf_config architectures in draft_model_config. However, draft_model_config's model_info and resolved architecture are not updated. As a result, for EAGLE draft models, hf_config.architectures is EagleLlamaForCausalLM while draft_model_config.architecture is still LlamaForCausalLM.

eagle_config = EAGLEConfig(
    self.draft_model_config.hf_config,
    method=self.method,
    model_type="eagle",
)
self.draft_model_config.hf_config = eagle_config
self.draft_model_config.model_arch_config = (
    self.draft_model_config.get_model_arch_config()
)

This PR updates all architectures-related fields in draft_model_config.
Some fields, such as runner_type and convert_type, are intentionally left unchanged: although they depend on architectures, that dependency only holds when runner_type=auto, and draft models use runner_type=draft.
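The stale-derived-field pattern behind this bug can be illustrated with a toy sketch (plain Python, not vLLM code; the class and attribute names below are made up for illustration):

```python
# Toy illustration of the bug pattern: a field derived in __init__ goes
# stale when its source attribute is swapped out after construction.
class ToyModelConfig:
    def __init__(self, hf_architectures):
        self.hf_architectures = hf_architectures
        self._architecture = self._resolve()  # cached at init time

    def _resolve(self):
        return self.hf_architectures[0]


cfg = ToyModelConfig(["LlamaForCausalLM"])
cfg.hf_architectures = ["EagleLlamaForCausalLM"]  # like swapping in EAGLEConfig
assert cfg._architecture == "LlamaForCausalLM"    # stale: still the old value

# The fix mirrors this PR: explicitly recompute derived fields after the swap.
cfg._architecture = cfg._resolve()
assert cfg._architecture == "EagleLlamaForCausalLM"
```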

Test Plan

python -m pytest tests/test_config.py::test_eagle_draft_model_config

Test Result

passed


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
@charlotte12l charlotte12l changed the title from [BugFix]Fix eagle draft_model_config to [BugFix]Fix eagle draft_model_config and add tests Jan 5, 2026
@charlotte12l charlotte12l marked this pull request as ready for review January 5, 2026 23:10
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes a bug where updating the hf_config for an EAGLE draft model did not correctly propagate to other architecture-related fields in the draft_model_config. The changes correctly update hf_text_config, model_arch_config, _model_info, and _architecture to be consistent. A new test case is also added to verify this fix. My review includes one suggestion to refactor the implementation for better maintainability by encapsulating the re-initialization logic within the ModelConfig class, which would make the code less fragile to future changes.

Comment on lines +404 to +420
# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config.hf_text_config = get_hf_text_config(
    self.draft_model_config.hf_config
)
self.draft_model_config.model_arch_config = (
    self.draft_model_config.get_model_arch_config()
)
model_info, arch = (
    self.draft_model_config.registry.inspect_model_cls(
        self.draft_model_config.architectures,
        self.draft_model_config,
    )
)
self.draft_model_config._model_info = model_info
self.draft_model_config._architecture = arch
Contributor


Severity: high

While this correctly fixes the issue, manually re-running parts of ModelConfig.__post_init__ here makes the code fragile. If the initialization logic in ModelConfig changes in the future, this code might become outdated and introduce subtle bugs.

To improve maintainability and encapsulation, I suggest adding a private helper method to ModelConfig to reset these architecture-dependent fields. This would also avoid accessing private members like _model_info and _architecture from outside the class.

You could add the following method to vllm/config/model.py (this file is not in the PR, so it would be an expansion of scope, but would improve the codebase):

# In vllm/config/model.py, class ModelConfig:
def _reset_architecture_fields(self):
    """
    Resets fields that are derived from the model architecture.
    This is useful when hf_config.architectures is modified after
    initialization.
    """
    from vllm.transformers_utils.config import get_hf_text_config
    self.hf_text_config = get_hf_text_config(self.hf_config)
    self.model_arch_config = self.get_model_arch_config()
    model_info, arch = self.registry.inspect_model_cls(
        self.architectures,
        self,
    )
    self._model_info = model_info
    self._architecture = arch

Then, you can simplify the code here as follows:

# EAGLEConfig primarily updates architectures, so update
# all architectures-related fields in draft_model_config
self.draft_model_config.hf_config = eagle_config
self.draft_model_config._reset_architecture_fields()

Contributor Author


Since _reset_architecture_fields would only be used by EAGLE draft models, I'm currently keeping this logic inside speculative.py. Happy to change if the reviewer also agrees with Gemini.

Collaborator


I don't think gemini's solution can solve the problem that people may forget to update this code. So I'm OK with current implementation.

)

# Replace hf_config for EAGLE draft_model
if self.method in ("eagle", "eagle3"):
Collaborator


does MTP have the same problem?

Contributor Author


I only saw EAGLE models updating hf_config; MTP models don't. But if it's MTP+EAGLE, like eagle618/eagle-deepseek-v3-random, then it will have the same problem.

Collaborator


@luccafong why don't we update the config for MTP?

Collaborator

@luccafong luccafong Jan 7, 2026


@heheda12345 AFAIK, since MTP reuses the target model's checkpoint and hf config, there's no need for a separate hf config (which most EAGLE models have).

Collaborator


Thanks! Then I think this implementation is good.

heheda12345 approved these changes Jan 8, 2026
@heheda12345 heheda12345 enabled auto-merge (squash) January 8, 2026 05:35
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 8, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
auto-merge was automatically disabled January 8, 2026 19:23

Head branch was pushed to by a user without write access

charlotte12l and others added 2 commits January 8, 2026 11:25
@charlotte12l
Contributor Author

charlotte12l commented Jan 8, 2026

After updating draft_model_config.hf_text_config accordingly in the current PR, another buggy code path is triggered:

  • Previously (before this PR), the code below was never reached, because hf_text_config could never be eagle:

    elif self.hf_text_config.model_type == "eagle":
        # if the model is an EAGLE module, check for the
        # underlying architecture
        return (
            self.hf_text_config.model.model_type
            in ("deepseek_v2", "deepseek_v3", "deepseek_v32")
            and self.hf_text_config.kv_lora_rank is not None
        )
    return False

    For eagle618/eagle-deepseek-v3-random, is_deepseek_mla is True.
  • After this PR, for eagle618/eagle-deepseek-v3-random, is_deepseek_mla will be False, as self.hf_text_config.model.model_type is deepseek_mtp.
  • My solution is updating ("deepseek_v2", "deepseek_v3", "deepseek_v32") to ("deepseek_v2", "deepseek_v3", "deepseek_v32", "deepseek_mtp"); let me know if this makes sense to you @luccafong @heheda12345
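The proposed tuple change can be sketched as follows (the helper name and signature below are hypothetical stand-ins; the real check is the is_deepseek_mla code quoted above):

```python
# Hedged sketch of the proposed fix: add "deepseek_mtp" to the MLA
# model-type tuple so MTP+EAGLE draft configs are still detected as MLA.
# The helper below is illustrative, not the actual vLLM method.
MLA_MODEL_TYPES = ("deepseek_v2", "deepseek_v3", "deepseek_v32", "deepseek_mtp")


def is_mla(model_type, kv_lora_rank):
    # Stand-in for the real check on hf_text_config.model
    return model_type in MLA_MODEL_TYPES and kv_lora_rank is not None


assert is_mla("deepseek_mtp", 512)       # detected after the fix
assert not is_mla("deepseek_mtp", None)  # still requires kv_lora_rank
```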

Another weird PretrainedConfig issue I encountered while debugging the above:

  • In the code above, I added print(f"{self.hf_text_config.model=}") and print(f"{self.hf_text_config.model.model_type=}").
  • I found model_type in print(f"{self.hf_text_config.model=}") is deepseek_v3, while print(f"{self.hf_text_config.model.model_type=}") is deepseek_mtp.
  • I suspect __repr__ is not picking up the updated model_type; is this expected? @hmellor

@heheda12345 heheda12345 merged commit 80221e1 into vllm-project:main Jan 13, 2026
48 of 49 checks passed
@heheda12345
Collaborator

@hmellor

I'm suspecting maybe __repr__ is not updating model_type correctly, is this expected?

Can you help to take a look? I merged this PR and hope @charlotte12l can help to fix this in a follow-up PR.

TomerBN-Nvidia pushed a commit to TomerBN-Nvidia/vllm that referenced this pull request Jan 13, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Signed-off-by: Tomer Natan <tbarnatan@computelab-frontend-8.nvidia.com>
@charlotte12l
Contributor Author

charlotte12l commented Jan 13, 2026

@heheda12345 @hmellor I might find the root cause but not sure if it's intended by huggingface:

This line explicitly uses self.__class__.model_type (the class attribute) when serializing, so it won't use the instance attribute model_type, which speculative decoding often overrides:

if hf_config.model_type in ("deepseek_v3", "deepseek_v32"):
    hf_config.model_type = "deepseek_mtp"
if hf_config.model_type == "deepseek_mtp":
    n_predict = getattr(hf_config, "num_nextn_predict_layers", None)
    hf_config.update(
        {"n_predict": n_predict, "architectures": ["DeepSeekMTPModel"]}
    )
if hf_config.model_type in ("pangu_ultra_moe"):
    hf_config.model_type = "pangu_ultra_moe_mtp"

https://github.com/huggingface/transformers/blob/v5.0.0rc2/src/transformers/configuration_utils.py#L972
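The root cause can be demonstrated standalone with a toy class (not transformers code; FakeConfig and its to_dict are illustrative) — a serializer that reads self.__class__.model_type never sees an instance-level override:

```python
# Toy demonstration of the class-attribute-vs-instance-attribute issue:
# serialization that reads self.__class__.model_type ignores an instance
# override, which is exactly what the MTP/speculative patch sets.
class FakeConfig:
    model_type = "deepseek_v3"  # class attribute

    def to_dict(self):
        # Mirrors the pattern in configuration_utils.py linked above:
        # the class attribute is read explicitly.
        return {"model_type": self.__class__.model_type}


cfg = FakeConfig()
cfg.model_type = "deepseek_mtp"  # instance override, like the MTP code above
assert cfg.model_type == "deepseek_mtp"              # attribute access: override wins
assert cfg.to_dict()["model_type"] == "deepseek_v3"  # serialization: class attr wins
```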

sammysun0711 pushed a commit to sammysun0711/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
akh64bit pushed a commit to akh64bit/vllm that referenced this pull request Jan 16, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed


3 participants